Multi-sensor fusion for robust autonomous flight in indoor and outdoor environments with a rotorcraft micro-aerial vehicle (MAV)

ABSTRACT

The subject matter described herein includes a modular and extensible approach to integrate noisy measurements from multiple heterogeneous sensors that yield either absolute or relative observations at different and varying time intervals, and to provide smooth and globally consistent estimates of position in real time for autonomous flight. We describe the development of the algorithms and software architecture for a new 1.9 kg MAV platform equipped with an IMU, laser scanner, stereo cameras, pressure altimeter, magnetometer, and a GPS receiver, in which the state estimation and control are performed onboard on an Intel NUC 3rd generation i3 processor. We illustrate the robustness of our framework in large-scale, indoor-outdoor autonomous aerial navigation experiments involving traversals of over 440 meters at average speeds of 1.5 m/s with winds around 10 mph while entering and exiting buildings.

PRIORITY CLAIM

This application claims the benefit of U.S. Provisional Application Ser. No. 61/910,022, filed Nov. 27, 2013, the disclosure of which is incorporated herein by reference in its entirety.

GOVERNMENT INTEREST

This invention was made with government support under Grant Nos. N00014-07-1-0829, N00014-08-1-0696, N00014-09-1-1031, and N00014-09-1-1051 awarded by the Office of Naval Research, Grant Nos. 1138847, 113830, and 1138110 awarded by the National Science Foundation, Grant Nos. W911NF-08-2-0004 and W911NF-10-2-0016 awarded by the Air Force Office of Scientific Research, and Grant No. FA9550-10-1-0567 awarded by the Army Research Laboratory. The government has certain rights in the invention.

TECHNICAL FIELD

The subject matter described herein relates to controlling autonomous flight in a micro-aerial vehicle. More particularly, the subject matter described herein relates to multi-sensor fusion for robust autonomous flight in indoor and outdoor environments with a rotorcraft micro-aerial vehicle (MAV).

BACKGROUND

Micro-aerial vehicles, such as rotorcraft micro-aerial vehicles, are capable of flying autonomously. Accurate autonomous flight can be achieved provided that there is sufficient sensor data available to provide control input for the autonomous flight. For example, in some outdoor environments where a global positioning system (GPS) is available, autonomous flight can be achieved based on GPS signals. However, in environments where GPS is not available, such as indoor environments and even outdoor urban environments, autonomous flight based on GPS alone is not possible. In some indoor environments, magnetometer output may not be available or reliable due to magnetic interference caused by structures. Thus, reliance on a single modality of sensor to control flight of a rotorcraft MAV may not be desirable.

Another goal of controlling autonomous flight of a rotorcraft MAV is smooth transition between states when a sensor modality that was not previously available becomes available. For example, when a rotorcraft MAV is flying indoors where GPS is not available and then transitions to an outdoor environment where GPS suddenly becomes available, the rotorcraft may determine that it is far off course and may attempt to correct the error by immediately moving to be on course. It is desirable that such transitions be smooth, rather than having the rotorcraft immediately make large changes in velocity and trajectory to get back on course.

Multiple types of sensor data are available to control autonomous flight in rotorcraft micro-aerial vehicles. For example, onboard cameras, laser scanners, GPS transceivers, and accelerometers can provide multiple inputs that are suitable as control inputs for controlling flight. However, as stated above, relying on any one of these sensors fails when the assumptions associated with that sensor fail. Because each type of sensor produces a unique kind of output with a unique level of uncertainty in its measurement, there exists a need for improved methods, systems, and computer readable media for multi-sensor fusion for robust autonomous flight in indoor and outdoor environments with a rotorcraft MAV.

SUMMARY

The subject matter described herein includes a modular and extensible approach to integrate noisy measurements from multiple heterogeneous sensors that yield either absolute or relative observations at different and varying time intervals, and to provide smooth and globally consistent estimates of position in real time for autonomous flight. We describe the development of the algorithms and software architecture for a new 1.9 kg MAV platform equipped with an inertial measurement unit (IMU), laser scanner, stereo cameras, pressure altimeter, magnetometer, and a GPS receiver, in which the state estimation and control are performed onboard on an Intel NUC 3rd generation i3 processor. We illustrate the robustness of our framework in large-scale, indoor-outdoor autonomous aerial navigation experiments involving traversals of over 440 meters at average speeds of 1.5 m/s with winds around 10 mph while entering and exiting buildings.

The subject matter described herein may be implemented in hardware, software, firmware, or any combination thereof. As such, the terms “function”, “node” or “module” as used herein refer to hardware, which may also include software and/or firmware components, for implementing the feature being described. In one exemplary implementation, the subject matter described herein may be implemented using a computer readable medium having stored thereon computer executable instructions that when executed by the processor of a computer control the computer to perform steps. Exemplary computer readable media suitable for implementing the subject matter described herein include non-transitory computer-readable media, such as disk memory devices, chip memory devices, programmable logic devices, and application specific integrated circuits. In addition, a computer readable medium that implements the subject matter described herein may be located on a single device or computing platform or may be distributed across multiple devices or computing platforms.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter described herein will now be explained with reference to the accompanying drawings of which:

FIG. 1 depicts a 1.9 kg MAV platform equipped with an IMU, laser scanner, stereo cameras, pressure altimeter, magnetometer, and GPS receiver. All the computation is performed onboard on an Intel NUC computer with a 3rd generation i3 processor;

FIG. 2 depicts delayed, out-of-order measurement handling with a priority queue. While z₄ arrives before z₂, z₂ is applied to the filter first. z₄ is temporarily stored in the queue. z₁ is discarded since it is older than t_d from the current state. The covariance is only propagated up to the time where the most recent measurement is applied to the filter. The state is propagated up to the most recent IMU input;

FIGS. 3A and 3B illustrate that the GPS signal is regained at k=5, resulting in large discrepancies between the measurement z₅ and the state s₅ (FIG. 3A). Pose graph SLAM produces a globally consistent graph (FIG. 3B);

FIGS. 4A and 4B illustrate the alternative GPS fusion, in which the discrepancy between the transformed GPS measurement z₅ and the non-optimized state s₅ is minimized. Fusion of such an indirect GPS measurement leads to a smooth state estimate (dashed line between s₆ and s₅);

FIGS. 5A and 5B depict that the MAV maneuvers aggressively with a maximum speed of 3.5 m/s (FIG. 5B). The horizontal position also compares well with the ground truth, with slight drift (FIG. 5A);

FIGS. 6A-6H depict images from the onboard camera (FIGS. 6A-6D) and an external camera (FIGS. 6E-6H). Note the vast variety of environments, including open space, trees, complex building structures, and indoor environments. We highlight the position of the MAV with a circle. Videos of the experiments are available in the video attachment and at http://mrsl.grasp.upenn.edu/shaojie/ICRA2014.mp4;

FIG. 7 depicts a vehicle trajectory aligned with satellite imagery. Different colors indicate different combinations of sensing modalities. G=GPS, V=Vision, and L=Laser;

FIG. 8 illustrates sensor availability over time. Note that failures occurred to all sensors. This shows that multi-sensor fusion is a must for this kind of indoor-outdoor mission;

FIG. 9 illustrates covariance changes as the vehicle flies through a dense building area (between 200 s-300 s, top of FIG. 7). The GPS comes in and out due to building shadowing. The covariance of x, y, and yaw increases as GPS fails and decreases as GPS resumes. Note that the body frame velocity is observable regardless of GPS measurements, and thus its covariance remains small. The spike in the velocity covariance is due to the vehicle directly facing the sun. The X-Y covariance is calculated from the Frobenius norm of the covariance submatrix;

FIG. 10 depicts vehicle trajectory overlaid on a satellite map. The vehicle operates in a tree-lined campus environment, where there is high risk of GPS failure during operation;

FIGS. 11A and 11B depict onboard (FIG. 11A) and external (FIG. 11B) camera images as the MAV autonomously flies through a tree-lined campus environment. Note the nontrivial lighting conditions;

FIG. 12 is a block diagram of a rotorcraft MAV for performing multi-sensor fusion according to an embodiment of the subject matter described herein;

FIG. 13 is a flow chart illustrating an exemplary process for multi-sensor fusion controlling autonomous flight of a rotorcraft MAV according to an embodiment of the subject matter described herein;

FIG. 14 illustrates an experimental platform with limited onboard computation (Intel Atom 1.6 GHz processor) and sensing (two cameras with fisheye lenses and an off-the-shelf inexpensive IMU). The platform mass is 740 g;

FIG. 15 illustrates a system architecture with update rates and information flow between modules marked;

FIG. 16 illustrates the performance of body frame velocity estimation during autonomous tracking of the trajectory presented in Sect. VIII-A;

FIGS. 17A-17D illustrate the effects on feature tracking performance due to fast translation (FIGS. 17A-17B) and fast rotation (FIGS. 17C-17D). The number of tracked features significantly decreases after rotation;

FIGS. 18A and 18B illustrate that a simulated quadrotor tracks a smooth trajectory generated from a sequence of waypoints. Trajectory regeneration takes place after a change of waypoints at 20 s;

FIG. 19 illustrates a finite state machine-based approach to MAV navigation that enables the operator to interact with the vehicle during experiments;

FIGS. 20A and 20B illustrate desired, estimated and actual trajectories when the robot is commanded to follow a smooth trajectory generated from a rectangle pattern;

FIG. 21A is a snapshot image of the indoor environment, and FIG. 21B is the image captured by the onboard camera. Note that the floor is featureless, which can pose a challenge to approaches that rely on downward-facing cameras;

FIGS. 22A-22C illustrate maps and estimated positions during the indoor navigation experiment. Note the nontrivial discontinuities in the pose estimates obtained via SLAM after the loop closure (FIG. 22C);

FIG. 23 illustrates a final 3D map and trajectory of the outdoor experiment after closing the loop; and

FIGS. 24A-24D contain images of autonomous navigation in a complex outdoor environment. Images from both the external video camera and the onboard camera are shown. Videos of the experiments are available at http://mrsl.grasp.upenn.edu/shaojie/IROS2013.mov.

DETAILED DESCRIPTION

Rotorcraft micro-aerial vehicles (MAVs) are ideal platforms for surveillance and search and rescue in confined indoor and outdoor environments due to their small size, superior mobility, and hover capability. In such missions, it is essential that the MAV is capable of autonomous flight to minimize operator workload. Robust state estimation is critical to autonomous flight, especially because of the inherently fast dynamics of MAVs. Due to cost and payload constraints, most MAVs are equipped with low cost proprioceptive sensors (e.g. MEMS IMUs) that are inadequate for long term state estimation. As such, exteroceptive sensors, such as GPS, cameras, and laser scanners, are usually fused with proprioceptive sensors to improve estimation accuracy. Besides the well-developed GPS-based navigation technology [1, 2], there is recent literature on robust state estimation for autonomous flight in GPS-denied environments using laser scanners [3, 4], monocular cameras [5, 6], stereo cameras [7, 8], and RGB-D sensors [9]. However, all these approaches rely on a single exteroceptive sensing modality that is only functional under certain environmental conditions. For example, laser-based approaches require structured environments, vision-based approaches demand sufficient lighting and features, and GPS only works outdoors. This makes them prone to failure in large-scale environments involving indoor-outdoor transitions, in which the environment can change significantly. It is clear that in such scenarios, multiple measurements from GPS, cameras, and lasers may be available, and the fusion of all these measurements yields increased estimator accuracy and robustness. In practice, however, this extra information is either ignored or used to switch between sensor suites [10].

The main goal of this work is to develop a modular and extensible approach to integrate noisy measurements from multiple heterogeneous sensors that yield either absolute or relative observations at different and varying time intervals, and to provide smooth and globally consistent estimates of position in real time for autonomous flight. The first key contribution, central to our work, is a principled approach, building on [11], to fusing relative measurements by augmenting the vehicle state with copies of previous states to create an augmented state vector for which consistent estimates are obtained and maintained using a filtering framework. A second significant contribution is our Unscented Kalman Filter (UKF) formulation in which the propagation and update steps circumvent the difficulties that result from the semi-definiteness of the covariance matrix for the augmented state. Finally, we demonstrate results with a new experimental platform (FIG. 1) to illustrate the robustness of our framework in large-scale, indoor-outdoor autonomous aerial navigation experiments involving traversals of over 440 meters at average speeds of 1.5 m/s with winds around 10 mph while entering and exiting two buildings.

Next, we present previous work on which our work is based. In Section III we outline the modeling framework before presenting the key contributions of our UKF-based sensor fusion scheme in Section IV. We bring all the ideas together in our description of the experimental platform and the experimental results in Section VI.

II. Previous Work

We are interested in applying filtering-based approaches with constant computational complexity, such as nonlinear variants of the Kalman filter, to fuse all available sensor information. We stress that although SLAM-based multi-sensor fusion approaches [12, 13] yield optimal results, they are computationally expensive for real-time state feedback for the purpose of autonomous control.

While it is straightforward to fuse multiple absolute measurements, such as GPS and pressure/laser altimeter readings, in a recursive filtering formulation, the fusion of multiple relative measurements obtained from laser or visual odometry is more involved. It is common to accumulate the relative measurements with the previous state estimates and fuse them as pseudo-absolute measurements [5, 14]. However, such fusion is sub-optimal since the resulting global position and yaw covariance is inconsistently small compared to the actual estimation error. This violates the observability properties [6], which suggest that such global quantities are in fact unobservable. As such, we develop our method based on state augmentation techniques [11] to properly account for the state uncertainty when applying multiple relative measurements from multiple sensors.

We aim to develop a modular framework that allows easy addition and removal of sensors with minimal coding and mathematical derivation. We note that in the popular EKF-based formulations [5, 8], the computation of Jacobians can be problematic for complex systems like MAVs. As such, we employ a loosely coupled, derivative-free Unscented Kalman Filter (UKF) framework [1]. Switching from the EKF to the UKF poses several challenges, which will be detailed and addressed in Sect. IV-A. The work in [15] is similar to ours; however, the EKF-based estimator in [15] does not support fusion of multiple relative measurements.

III. Multi-Sensor System Model

We define vectors in the world and body frames as (•)^w and (•)^b, respectively. For the sake of brevity, we assume that all onboard sensors are calibrated and are attached to the body frame. The main state of the MAV is defined as:

$x = \left[ p^{w}, \Phi^{w}, \dot{p}^{b}, b_{a}^{b}, b_{\omega}^{b}, b_{z}^{w} \right]^{T}$

where $p^{w} = [x^{w}, y^{w}, z^{w}]^{T}$ is the 3D position in the world frame, $\Phi^{w} = [\psi^{w}, \theta^{w}, \phi^{w}]^{T}$ contains the yaw, pitch, and roll Euler angles that represent the 3-D orientation of the body in the world frame, from which a matrix $R_{b}^{w}$ that represents the rotation of a vector from the body frame to the world frame can be obtained. $\dot{p}^{b}$ is the 3D velocity in the body frame. $b_{a}^{b}$ and $b_{\omega}^{b}$ are the biases of the accelerometer and gyroscope, both expressed in the body frame. $b_{z}^{w}$ models the bias of the laser and/or pressure altimeter in the world frame.

We consider an IMU-based state propagation model:

$u_{t} = \left[ a^{b}, \omega^{b} \right]^{T}$

$v_{t} = \left[ v_{a}, v_{\omega}, v_{b_{a}}, v_{b_{\omega}}, v_{b_{z}} \right]^{T}$

$x_{t+1} = f\left( x_{t}, u_{t}, v_{t} \right) \qquad (1)$

where $u_{t}$ is the measurement of the body frame linear accelerations and angular velocities from the IMU, and $v_{t} \sim \mathcal{N}(0, D_{t}) \in \mathbb{R}^{m}$ is the process noise. $v_{a}$ and $v_{\omega}$ represent the additive noise associated with the accelerometer and the gyroscope. $v_{b_{a}}$, $v_{b_{\omega}}$, and $v_{b_{z}}$ model the Gaussian random walk of the accelerometer, gyroscope, and altimeter biases. The function f(•) is a discretized version of the continuous time dynamical equation [6].
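
For concreteness, the following is a minimal sketch of one Euler-discretized propagation step f(•) under the state layout above. The rotation convention, the gravity sign, and all function and variable names are our illustrative assumptions, not the authors' implementation; a real system would use a more careful integration scheme.

```python
import numpy as np

def R_wb(eul):
    """Rotation of a vector from body to world frame, ZYX (yaw, pitch, roll) Euler angles."""
    psi, th, ph = eul
    Rz = np.array([[np.cos(psi), -np.sin(psi), 0], [np.sin(psi), np.cos(psi), 0], [0, 0, 1]])
    Ry = np.array([[np.cos(th), 0, np.sin(th)], [0, 1, 0], [-np.sin(th), 0, np.cos(th)]])
    Rx = np.array([[1, 0, 0], [0, np.cos(ph), -np.sin(ph)], [0, np.sin(ph), np.cos(ph)]])
    return Rz @ Ry @ Rx

def f(x, u, v, dt, g=9.81):
    """One discretized propagation step x_{t+1} = f(x_t, u_t, v_t) for the state
    [p_w(3), yaw/pitch/roll(3), v_b(3), b_a(3), b_w(3), b_z(1)]."""
    p, eul, vb = x[0:3], x[3:6], x[6:9]
    ba, bw, bz = x[9:12], x[12:15], x[15]
    a = u[0:3] - ba - v[0:3]   # bias- and noise-corrected specific force (body frame)
    w = u[3:6] - bw - v[3:6]   # bias- and noise-corrected angular rate (body frame)
    R = R_wb(eul)
    psi, th, ph = eul
    # Mapping from body angular rates to ZYX Euler-angle rates.
    E = np.array([[0, np.sin(ph) / np.cos(th),  np.cos(ph) / np.cos(th)],
                  [0, np.cos(ph),              -np.sin(ph)],
                  [1, np.sin(ph) * np.tan(th),  np.cos(ph) * np.tan(th)]])
    gw = np.array([0.0, 0.0, -g])  # gravity in the world frame (z up, assumed convention)
    xn = x.copy()
    xn[0:3] = p + (R @ vb) * dt                             # world-frame position
    xn[3:6] = eul + (E @ w) * dt                            # Euler-angle kinematics
    xn[6:9] = vb + (a + R.T @ gw - np.cross(w, vb)) * dt    # body-frame velocity
    xn[9:12] = ba + v[6:9] * dt                             # accelerometer bias random walk
    xn[12:15] = bw + v[9:12] * dt                           # gyroscope bias random walk
    xn[15] = bz + v[12] * dt                                # altimeter bias random walk
    return xn
```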

Exteroceptive sensors are usually used to correct the errors in the state propagation. Following [11], we consider measurements as either being absolute or relative, depending on the nature of the underlying sensor. We allow an arbitrary number of either absolute or relative measurement models.

A. Absolute Measurements

All absolute measurements can be modeled in the form:

$z_{t+m} = h_{a}\left( x_{t+m}, n_{t+m} \right) \qquad (2)$

where $n_{t+m} \sim \mathcal{N}(0, Q_{t}) \in \mathbb{R}^{p}$ is the measurement noise, which can be either additive or not. $h_{a}(\cdot)$ is in general a nonlinear function. An absolute measurement connects the current state with the sensor output. Examples are shown in Sect. V-B.

B. Relative Measurements

A relative measurement connects the current and the past states with the sensor output, which can be written as:

$z_{t+m} = h_{r}\left( x_{t+m}, x_{t}, n_{t+m} \right) \qquad (3)$

This formulation accurately models the nature of odometry-like algorithms (Sect. V-C and Sect. V-D), as odometry measures the incremental change of the state between two time instants. We also note that, in order to avoid temporal drifting, most state-of-the-art laser/visual odometry algorithms are keyframe based. As such, we allow multiple future measurements ($m \in M$, $|M| > 1$) that correspond to the same past state $x_{t}$.

IV. UKF-Based Multi-Sensor Fusion

We wish to design a modular sensor-fusion filter that is easily extensible even for inexperienced users. This means that the amount of coding and mathematical derivation for the addition/removal of sensors should be minimal. One disadvantage of the popular EKF-based filtering framework is the requirement of computing Jacobian matrices, which can be difficult and time consuming for a complex MAV system. As such, we employ the derivative-free UKF-based approach [1]. The key idea of the UKF is the approximation of the propagation of Gaussian random vectors through nonlinear functions via the propagation of sigma points. Let $x \sim \mathcal{N}(\hat{x}, P^{xx}) \in \mathbb{R}^{n}$ and consider the nonlinear function:

$y = g(x), \qquad (4)$

and let:

$\chi = \left[ \hat{x}, \; \hat{x} \pm \left( \sqrt{(n+\lambda)P^{xx}} \right)_{i} \right] \quad \text{for } i = 1, \ldots, n$

$\mathcal{Y}_{i} = g(\chi_{i}), \qquad (5)$

where $g(\cdot)$ is a nonlinear function and $\lambda$ is a UKF parameter. $\left( \sqrt{(n+\lambda)P^{xx}} \right)_{i}$ is the $i$-th column of the square-root covariance matrix, which is usually computed via Cholesky decomposition, and the $\chi_{i}$ are called the sigma points. The mean and covariance of the random vector y, and the cross-covariance between x and y, can be approximated as:

$\hat{y} = \sum_{i=0}^{2n} w_{i}^{m} \mathcal{Y}_{i}, \quad P^{yy} = \sum_{i=0}^{2n} w_{i}^{c} \left( \mathcal{Y}_{i} - \hat{y} \right) \left( \mathcal{Y}_{i} - \hat{y} \right)^{T}, \quad P^{yx} = \sum_{i=0}^{2n} w_{i}^{c} \left( \mathcal{Y}_{i} - \hat{y} \right) \left( \chi_{i} - \hat{x} \right)^{T} \qquad (6)$

where $w_{i}^{m}$ and $w_{i}^{c}$ are weights for the sigma points. This unscented transform can be used to keep track of the covariance in both the state propagation and the measurement update, thus avoiding the need for Jacobian-based covariance approximation.
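
A minimal numerical sketch of this transform, assuming the standard symmetric sigma-point weights (the β refinement and the authors' exact parameter choices are omitted):

```python
import numpy as np

def unscented_transform(g, x_hat, Pxx, lam=2.0):
    """Propagate N(x_hat, Pxx) through g via sigma points (5), returning the
    approximations (6): mean y_hat, covariance Pyy, cross-covariance Pyx."""
    n = x_hat.size
    S = np.linalg.cholesky((n + lam) * Pxx)  # fails if Pxx is not positive definite
    chi = [x_hat] + [x_hat + S[:, i] for i in range(n)] + [x_hat - S[:, i] for i in range(n)]
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wm[0] = lam / (n + lam)          # mean weights
    wc = wm                          # covariance weights (no beta correction here)
    Y = np.array([g(c) for c in chi])
    y_hat = wm @ Y
    Pyy = sum(w * np.outer(y - y_hat, y - y_hat) for w, y in zip(wc, Y))
    Pyx = sum(w * np.outer(y - y_hat, c - x_hat) for w, y, c in zip(wc, Y, chi))
    return y_hat, Pyy, Pyx
```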

A. State Augmentation for Multiple Relative Measurements

Since a relative measurement depends on both the current and past states, it violates the fundamental assumption in the Kalman filter that the measurement should only depend on the current state. One way to deal with this is through state augmentation [11], where a copy of the past state is maintained in the filter. Here we present an extension of [11] to handle an arbitrary number of relative measurement models, with the possibility that multiple measurements correspond to the same augmented state. Our generic filtering framework allows convenient setup, addition, and removal of absolute and relative measurement models.

Note that a measurement may not affect all components in the state x. For example, a visual odometry measurement only affects the 6-DOF (degree of freedom) pose, not the velocity or the bias terms. We define the $i$-th augmented state as $x_{i} \in \mathbb{R}^{n_{i}}$, $n_{i} \le n$, where $x_{i}$ is an arbitrary subset of x. We define a binary selection matrix $B_{i}$ of size $n_{i} \times n$ such that $x_{i} = B_{i} x$. Consider a time instant at which there are $l$ augmented states in the filter, along with the covariance:

$\check{x} = \left[ \hat{x}, \hat{x}_{1}, \ldots, \hat{x}_{l} \right]^{T}, \quad \check{P} = \begin{bmatrix} P^{xx} & P^{xx_{1}} & \cdots & P^{xx_{l}} \\ P^{x_{1}x} & P^{x_{1}x_{1}} & \cdots & P^{x_{1}x_{l}} \\ \vdots & \vdots & \ddots & \vdots \\ P^{x_{l}x} & P^{x_{l}x_{1}} & \cdots & P^{x_{l}x_{l}} \end{bmatrix} \qquad (7)$

The addition of a new augmented state $x_{l+1}$ can be done by:

$\check{x}^{+} = M^{+} \check{x}, \quad M^{+} = \begin{bmatrix} I_{n + \sum_{i=1}^{l} n_{i}} \\ B_{l+1} \end{bmatrix} \qquad (8)$

Similarly, the removal of an augmented state x_(j) is given as:

$\check{x}^{-} = M^{-} \check{x}, \quad M^{-} = \begin{bmatrix} I_{a} & O_{a \times n_{j}} & O_{a \times b} \\ O_{b \times a} & O_{b \times n_{j}} & I_{b} \end{bmatrix},$

where $a = n + \sum_{i=1}^{j-1} n_{i}$ and $b = \sum_{i=j+1}^{l} n_{i}$. The updated augmented state covariance is given as:

$\check{P}^{\pm} = M^{\pm} \check{P} \left( M^{\pm} \right)^{T}.$

The change of keyframes in an odometry-like measurement model is simply the removal of an augmented state $x_{i}$ followed by the addition of another augmented state with the same $B_{i}$. Since we allow multiple relative measurements that correspond to the same augmented state, in contrast to [11], augmented states are not deleted after measurement updates (Sect. IV-D).
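
The bookkeeping in (7)-(8) reduces to simple matrix constructions. A sketch, under the assumed convention that the stacked vector holds the main state first (names are ours):

```python
import numpy as np

def add_augmented_state(x_chk, P_chk, B_new):
    """Append a copy B_new @ x of the main state, eq. (8). B_new is the
    n_{l+1} x n binary selection matrix; it is zero-padded to act on the
    full stacked vector [x, x_1, ..., x_l]."""
    N = x_chk.size
    bottom = np.hstack([B_new, np.zeros((B_new.shape[0], N - B_new.shape[1]))])
    M_plus = np.vstack([np.eye(N), bottom])
    return M_plus @ x_chk, M_plus @ P_chk @ M_plus.T

def remove_augmented_state(x_chk, P_chk, a, n_j):
    """Drop the augmented block occupying indices [a, a + n_j), i.e. apply M^-
    with a = n + n_1 + ... + n_{j-1}."""
    keep = np.r_[0:a, a + n_j:x_chk.size]
    M_minus = np.eye(x_chk.size)[keep]   # row selection realizes M^-
    return M_minus @ x_chk, M_minus @ P_chk @ M_minus.T
```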

This state augmentation formulation works well in an EKF setting; however, it poses issues when we try to apply it to the UKF. Since the addition of a new augmented state (8) is essentially a copy of the main state, the resulting covariance matrix $\check{P}^{+}$ will not be positive definite, and the Cholesky decomposition (5) for state propagation will fail (it is non-unique). We therefore wish to have something that is similar to the Jacobian matrices in the EKF, but without explicitly computing the Jacobians.

B. Jacobians for UKF

In [16], the authors present a new interpretation of the UKF as a Linear Regression Kalman Filter (LRKF). In the LRKF, we seek the optimal linear approximation $y = Ax + b + e$ of the nonlinear function (4), given a weighted discrete (sigma point, (6)) representation of the distribution $\mathcal{N}(\hat{x}, P^{xx})$. The objective is to find the regression matrix A and vector b that minimize the linearization error e:

$\min_{A,b} \; \sum_{i=0}^{2n} w_{i} \left( \mathcal{Y}_{i} - A\chi_{i} - b \right) \left( \mathcal{Y}_{i} - A\chi_{i} - b \right)^{T}.$

As shown in [16], the optimal linear regression is given by:

$A = P^{yx} \left( P^{xx} \right)^{-1}, \quad b = \hat{y} - A\hat{x} \qquad (9)$

The linear regression matrix A in (9) serves as the linear approximation of the nonlinear function (4). It is similar to the Jacobian in the EKF formulation. As such, the propagation and update steps in the UKF can be performed in a similar fashion to the EKF.
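
In code, recovering this regression "Jacobian" from the unscented-transform outputs is a single solve (a sketch; solving rather than explicitly inverting $P^{xx}$ is our choice for numerical stability):

```python
import numpy as np

def ukf_regression(y_hat, x_hat, Pyx, Pxx):
    """Optimal linear regression (9): A = P^{yx} (P^{xx})^{-1}, b = y_hat - A x_hat."""
    A = np.linalg.solve(Pxx.T, Pyx.T).T   # solves A @ Pxx = Pyx without an explicit inverse
    b = y_hat - A @ x_hat
    return A, b
```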

C. State Propagation

Observing that during state propagation only the main state changes, we start by partitioning the augmented state and the covariance (7) into:

${{\overset{ˇ}{x}}_{tt} = \begin{bmatrix}{\hat{x}}_{tt} \\{\hat{x}}_{I_{tt}}\end{bmatrix}},{{\overset{ˇ}{P}}_{tt} = {\begin{bmatrix}P_{tt}^{xx} & P_{tt}^{{xx}_{i}} \\P_{tt}^{x_{I}x} & P_{tt}^{x_{I}x_{I}}\end{bmatrix}.}}$

The linear approximation of the nonlinear state propagation (1), applied on the augmented state (7), is:

$\begin{matrix}\begin{matrix}{{\overset{ˇ}{x}}_{{t + 1}t} = {f\left( {{\overset{ˇ}{x}}_{tt},u_{t},v_{t}} \right)}} \\{{= {{\begin{bmatrix}F_{t} & O \\O & I_{I}\end{bmatrix}{\overset{ˇ}{x}}_{tt}} + {\begin{bmatrix}J_{t} & G_{t} \\O & O\end{bmatrix}\begin{bmatrix}u_{t} \\v_{t}\end{bmatrix}} + b_{t} + e_{t}}},}\end{matrix} & (10)\end{matrix}$

from which we can see that the propagation of the full augmented state is actually unnecessary, since the only nontrivial regression matrix corresponds to the main state. We can propagate only the main state x via sigma points generated from $P_{t|t}^{xx}$ and use the UKF Jacobian $F_{t}$ to update the cross-covariance $P_{t|t}^{xx_{l}}$. Since the covariance matrix of the main state $P_{t|t}^{xx}$ is always positive definite, we avoid the Cholesky decomposition failure problem.

Since the process noise is not additive, we augment the main state with the process noise and generate sigma points from:

$\begin{matrix}{{{\overset{\_}{x}}_{tt} = \begin{bmatrix}{\hat{x}}_{tt} \\O\end{bmatrix}},{{\overset{\_}{P}}_{tt} = {\begin{bmatrix}P_{tt}^{xx} & O \\O & D_{t}\end{bmatrix}.}}} & (11)\end{matrix}$

The state is then propagated forward by substituting (11) into (1), (5) and (6). We obtain $\hat{x}_{t+1|t}$, the estimated value of x at time t+1 given the measurements up to t, as well as $P_{t+1|t}^{xx}$ and the cross-covariance $P_{t+1|t}^{x\bar{x}}$. Following (9), we know that:

$P_{t+1|t}^{x\bar{x}} \, \bar{P}_{t|t}^{-1} = \left[ F_{t}, G_{t} \right].$

The propagated augmented state and its covariance are updated according to (10):

$\begin{matrix}{{{\overset{ˇ}{x}}_{{t + 1}t} = \begin{bmatrix}{\hat{x}}_{{t + 1}t} \\{\hat{x}}_{I_{tt}}\end{bmatrix}},{{\overset{ˇ}{P}}_{{t + 1}t} = {\begin{bmatrix}P_{{t + 1}t}^{xx} & {F_{t}P_{tt}^{{xx}_{I}}} \\{P_{tt}^{x_{I}x}F_{t}^{T}} & P_{tt}^{x_{I}x_{I}}\end{bmatrix}.}}} & (12)\end{matrix}$

D. Measurement Update

Let there be m state propagations between two measurements, so that we maintain $\check{x}_{t+m|t}$ and $\check{P}_{t+m|t}$ when the newest measurement arrives. Consider a relative measurement (3) that depends on the $j$-th augmented state; the measurement prediction and its linear regression approximation can be written as:

$\begin{matrix}{{\hat{z}}_{{t + m}t} = {h_{r}\left( {{\hat{x}}_{{t + m}t},{B_{j}^{T}{\hat{x}}_{j_{{t + m}t},n_{t + m}}}} \right)}} \\{= {{H_{{t + m}t}{\overset{ˇ}{x}}_{{t + m}t}} + {L_{t + m}n_{t + m}} + b_{t + m} + e_{t + m}}}\end{matrix}$H_(t + mt) = [T_(t + mt)^(x), O, H_(t + mt)^(x_(j)), O].

Again, since only the main state and one augmented state are involved in each measurement update, we can construct another augmented state together with the possibly non-additive measurement noise:

$\grave{x}_{t+m|t} = \begin{bmatrix} \hat{x}_{t+m|t} \\ \hat{x}_{j,t+m|t} \\ O \end{bmatrix}, \quad \grave{P}_{t+m|t} = \begin{bmatrix} P_{t+m|t}^{xx} & P_{t+m|t}^{xx_{j}} & O \\ P_{t+m|t}^{x_{j}x} & P_{t+m|t}^{x_{j}x_{j}} & O \\ O & O & Q_{t+m} \end{bmatrix}.$

After the state propagation (12), $\grave{P}_{t+m|t}$ is guaranteed to be positive definite, thus it is safe to perform sigma point propagation as in (5) and (6). We obtain $\hat{z}_{t+m|t}$, $P_{t+m|t}^{zz}$, $P_{t+m|t}^{z\grave{x}}$, and:

$P_{t+m|t}^{z\grave{x}} \, \grave{P}_{t+m|t}^{-1} = \left[ H_{t+m|t}^{x}, H_{t+m|t}^{x_{j}}, L_{t+m} \right].$

We can then apply the measurement update similarly to the EKF:

$\check{K}_{t+m} = \check{P}_{t+m|t} H_{t+m|t}^{T} \left( P_{t+m|t}^{zz} \right)^{-1}$

$\check{x}_{t+m|t+m} = \check{x}_{t+m|t} + \check{K}_{t+m} \left( z_{t+m} - \hat{z}_{t+m|t} \right)$

$\check{P}_{t+m|t+m} = \check{P}_{t+m|t} - \check{K}_{t+m} H_{t+m|t} \check{P}_{t+m|t}$

where $z_{t+m}$ is the actual sensor measurement. Both the main and augmented states are corrected during the measurement update. We note that the entries in $H_{t+m|t}$ that correspond to inactive augmented states are zero. This can be utilized to speed up the matrix multiplication.
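
A sketch of this update, again reusing the `unscented_transform` helper; `idx_main` and `idx_aug` are index arrays for the main state and the involved augmented state within the stacked vector, and `h(x, x_j, n)` is the relative measurement model (all names are our illustrative assumptions):

```python
import numpy as np

def ukf_update(x_chk, P_chk, z, h, idx_main, idx_aug, Q):
    """Relative-measurement update: sigma points over [x, x_j, n] only, with the
    regression giving H = [H^x, O, H^{x_j}, O] (zero columns for inactive states)."""
    sel = np.concatenate([idx_main, idx_aug])
    nx, nq = sel.size, Q.shape[0]
    x_grave = np.concatenate([x_chk[sel], np.zeros(nq)])
    P_grave = np.block([[P_chk[np.ix_(sel, sel)], np.zeros((nx, nq))],
                        [np.zeros((nq, nx)), Q]])
    g = lambda s: h(s[:idx_main.size], s[idx_main.size:nx], s[nx:])
    z_hat, Pzz, Pzx = unscented_transform(g, x_grave, P_grave)
    H = np.zeros((z.size, x_chk.size))
    H[:, sel] = np.linalg.solve(P_grave.T, Pzx.T).T[:, :nx]   # [H^x, H^{x_j}] blocks
    K = np.linalg.solve(Pzz.T, (P_chk @ H.T).T).T             # K = P H^T (P^{zz})^{-1}
    return x_chk + K @ (z - z_hat), P_chk - K @ H @ P_chk
```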

The fusion of absolute measurements can simply be done by leaving the augmented-state slot empty (formally, setting $\hat{x}_{j,t+m|t} = \varnothing$, since no past state is involved) and applying the corresponding absolute measurement model (2).

As shown in FIG. 9, fusion of multiple relative measurements results in a slowly growing, but unbounded, covariance in the global position and yaw. This is consistent with the results in [6] that these global quantities are unobservable.

E. Delayed, Out-of-Order Measurement Update

When fusing multiple measurements, it is possible that the measurements arrive at the filter out of order, that is, a measurement that corresponds to an earlier state arrives after the measurement that corresponds to a later state. This violates the Markov assumption of the Kalman filter. Also, due to sensor processing delay, measurements may run behind the state propagation.

We address these two issues by storing measurements in a priority queue, where the top of the queue corresponds to the oldest measurement. A pre-defined maximum allowable sensor delay t_d of 100 ms was set for our MAV platform. Newly arrived measurements that correspond to a state older than t_d from the current state (generated by state propagation) are directly discarded. After each state propagation, we check the queue and process all measurements in the queue that are older than t_d. The priority queue essentially serves as a measurement reordering mechanism (FIG. 2) for all measurements that are not older than t_d from the current state. In the filter, we always utilize the most recent IMU measurement to propagate the state forward. We, however, only propagate the covariance on demand. As illustrated in FIG. 2, the covariance is only propagated from the time of the last measurement to the current measurement.
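
A minimal sketch of this reordering queue (the class and method names are ours, not part of the described system):

```python
import heapq
import itertools

class MeasurementQueue:
    """Priority queue that reorders delayed, out-of-order measurements (FIG. 2)."""

    def __init__(self, max_delay=0.1):      # t_d = 100 ms on our platform
        self.max_delay = max_delay
        self.heap = []
        self._seq = itertools.count()       # tie-breaker for identical stamps

    def push(self, stamp, measurement, state_time):
        """Queue a measurement unless it is already older than t_d from the current state."""
        if stamp >= state_time - self.max_delay:
            heapq.heappush(self.heap, (stamp, next(self._seq), measurement))

    def pop_ready(self, state_time):
        """Return, oldest first, measurements that have aged past t_d. Any older
        measurement still in flight would be discarded on arrival, so ordering is safe."""
        ready = []
        while self.heap and self.heap[0][0] <= state_time - self.max_delay:
            ready.append(heapq.heappop(self.heap)[2])
        return ready
```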

F. An Alternative Way for Handling Global Pose Measurements

As the vehicle moves through the environment, global pose measurements from GPS and magnetometer may be available. It is straightforward to fuse the GPS as a global pose measurement and generate the optimal state estimate. However, this may not be best for real-world applications. A vehicle that operates in a GPS-denied environment may suffer from accumulated drift. When the vehicle regains the GPS signal, as illustrated in FIG. 3A, there may be large discrepancies between the GPS measurement and the estimated state (z₅-s₅). Directly applying GPS as a global measurement will result in undesirable behaviors in both estimation (large linearization error) and control (sudden pose change).

This is not a new problem, and it has been studied for ground vehicles [17] under the term local frame-based navigation. However, [17] assumes that a reasonably accurate local estimate of the vehicle is always available (e.g. wheel odometry). This is not the case for MAVs, since the state estimate from only the onboard IMU drifts away vastly within a few seconds. The major difference between an IMU and wheel odometry is that an IMU drifts temporally, but wheel odometry only drifts spatially. However, we have relative exteroceptive sensors that are able to produce temporally drift-free estimates. As such, we only need to deal with the case that all relative exteroceptive sensors have failed. Therefore, our goal is to properly transform the global GPS measurement into the local frame to bridge the gap between relative sensor failures.

Consider a pose-only graph SLAM formulation with $s_{k} = [x_{k}^{w}, y_{k}^{w}, \psi_{k}^{w}]^{T} \in \mathcal{S}$ being 2D poses. We try to find the optimal configuration of the pose graph given incremental motion constraints $d_{k}$ from laser/visual odometry, spatial loop closure constraints $l_{k}$, and absolute pose constraints $z_{k}$ from GPS:

$\min_{\mathcal{S}} \left\{ \sum_{k=1}^{M} \left\| h_{d}\left( s_{k-1}, d_{k} \right) - s_{k} \right\|_{p_{k}^{d}} + \sum_{k=1}^{L} \left\| h_{l}\left( s_{k}, l_{k} \right) - s_{l(k)} \right\|_{p_{k}^{l}} + \sum_{k=1}^{N} \left\| z_{k} - s_{k} \right\|_{p_{k}^{z}} \right\}.$

The optimal pose graph configuration can be found with available solvers [18], as shown in FIG. 3B. The pose graph is disconnected if there are no relative exteroceptive measurements between two nodes. Let two pose graphs be disconnected between k−1 and k.

The pose graph SLAM provides the transformation between the non-optimized state $s_{k-1}$ and the SLAM-optimized state $s_{k-1}^{+}$. This transform can be utilized to transform the global GPS measurement to be aligned with $s_{k-1}$:

$\Delta_{k-1} = s_{k-1} \ominus s_{k-1}^{+}$

$z_{k-1}^{-} = \Delta_{k-1} \oplus z_{k-1}$

where ⊕ and ⊖ are pose compounding operations as defined in [19]. The covariance $P_{k-1}^{\Delta}$ of $\Delta_{k-1}$, and subsequently the covariance $P_{k-1}^{z^{-}}$ of $z_{k-1}^{-}$, can be computed following [19]. This formulation minimizes the discrepancies between $z_{k-1}^{-}$ and $s_{k-1}$, and thus maintains smoothness in the state estimate. The transformed GPS measurement $z_{k-1}^{-}$ is still applied as an absolute measurement to the UKF (FIG. 4A and FIG. 4B).
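
A 2-D sketch of this alignment, with the ⊕/⊖ compounding of [19] written out for poses [x, y, ψ]; the exact composition order of Δ is our reading of the equations above:

```python
import numpy as np

def oplus(a, b):
    """2-D pose compounding a ⊕ b for poses [x, y, psi] (cf. [19])."""
    x, y, psi = a
    c, s = np.cos(psi), np.sin(psi)
    return np.array([x + c * b[0] - s * b[1],
                     y + s * b[0] + c * b[1],
                     psi + b[2]])

def ominus(a):
    """Inverse pose ⊖ a, so that (⊖ a) ⊕ a is the identity pose."""
    x, y, psi = a
    c, s = np.cos(psi), np.sin(psi)
    return np.array([-c * x - s * y, s * x - c * y, -psi])

def align_gps(s_est, s_opt, z_gps):
    """Delta = s ⊖ s+ maps the optimized frame onto the local (non-optimized)
    frame; the transformed measurement z^- = Delta ⊕ z stays near s_est."""
    delta = oplus(s_est, ominus(s_opt))
    return oplus(delta, z_gps)
```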

However, despite the large scale of our field experiments (Sect. VI), we rarely found a case in which the accumulated drift was large enough to cause issues with direct GPS fusion. In the future, we will seek even larger scale experiments to verify the necessity of the above local frame-based approach.

V. Implementation Details

A. Experimental Platform

The experimental platform shown in FIG. 1 is based on the Pelican quadrotor from Ascending Technologies, GmbH (http://www.asctec.de/). This platform is natively equipped with an AutoPilot board consisting of an IMU and a user-programmable ARM7 microcontroller. The main computation unit onboard is an Intel NUC with a 1.8 GHz Core i3 processor, 8 GB of RAM, and a 120 GB SSD. The sensor suite includes a ublox LEA-6T GPS module, a Hokuyo UTM-30LX LiDAR, and two mvBlueFOX-MLC200w grayscale HDR cameras with fisheye lenses that capture 752×480 images at 25 Hz. We use hardware triggering for frame synchronization. The onboard auto exposure controller is fine tuned to enable fast adaptation during rapid lighting condition changes. A 3-D printed laser housing redirects some of the laser beams for altitude measurement. The total mass of the platform is 1.87 kg. The entire algorithm is developed in C++ using the robot operating system (ROS) (http://www.ros.org) as the interfacing robotics middleware.

B. Absolute Measurements

Some onboard sensors are capable of producing absolute measurements (Sect. III-A); their details are as follows:

1) GPS And Magnetometer:

$z_{t} = \begin{bmatrix} \begin{pmatrix} x_{t}^{w} \\ y_{t}^{w} \end{pmatrix} \\ R_{b}^{w} \begin{pmatrix} \dot{x}_{t}^{b} \\ \dot{y}_{t}^{b} \end{pmatrix} \\ \psi_{t}^{w} \end{bmatrix} + n_{t}.$

2) Laser/Pressure Altimeter:

$z_{t} = z_{t}^{w} + b_{z,t}^{w} + n_{t}$

3) Pseudo Gravity Vector:

If the MAV is near hover or moving at approximately constant speed, we may say that the accelerometer output provides a pseudo measurement of the gravity vector. Let $g^{w} = [0, 0, g]^{T}$; we have:

$z_{t} = \left( R_{b}^{w} \right)^{T} g^{w} + b_{a,t}^{b} + n_{t}.$

C. Relative Measurement—Laser-Based Odometry

We utilize the laser-based odometry that we developed in our earlier work [4]. Observing that man-made indoor environments mostly contain vertical walls, we can make a 2.5-D environment assumption. With this assumption, we can make use of the onboard roll and pitch estimates to project the laser scans onto a common ground plane. As such, 2D scan matching can be utilized to estimate the incremental horizontal motion of the vehicle. We keep a local map to avoid drifting while hovering.

$z_{t+m} = \left( \ominus_{2d} \begin{bmatrix} x_{t}^{w} \\ y_{t}^{w} \\ \psi_{t}^{w} \end{bmatrix} \right) \oplus_{2d} \begin{bmatrix} x_{t+m}^{w} \\ y_{t+m}^{w} \\ \psi_{t+m}^{w} \end{bmatrix} + n_{t+m},$

where $p_{2d,t} = [x_{t}^{w}, y_{t}^{w}, \psi_{t}^{w}]^{T}$, and $\oplus_{2d}$ and $\ominus_{2d}$ are the 2-D pose compounding operations as defined in [19].

D. Relative Measurement—Visual Odometry

We implemented a classic keyframe-based visual odometry algorithm. Keyframe-based approaches have the benefit of being temporally drift-free. We choose to use light-weight corner features but run the algorithm at a high rate (25 Hz). Features are tracked across images via a KLT tracker. Given a keyframe with a set of triangulated feature points, we run a robust iterative 2D-3D pose estimation [8] to estimate the 6-DOF motion of the vehicle with respect to the keyframe. New keyframes are inserted depending on the distance traveled and the current number of valid 3D points.

$z_{t+m} = \left( \ominus \begin{bmatrix} p_{t}^{w} \\ \Phi_{t}^{w} \end{bmatrix} \right) \oplus \begin{bmatrix} p_{t+m}^{w} \\ \Phi_{t+m}^{w} \end{bmatrix} + n_{t+m}$

E. Feedback Control

To achieve stable flight across different environments with possibly large orientation changes, we choose to use a position tracking controller with a nonlinear error metric [20]. The 100 Hz filter output (Sect. IV) is used directly as the feedback for the controller. In our implementation, the attitude controller runs at 1 kHz on the ARM processor of the MAV's AutoPilot board, while the position tracking control operates at 100 Hz on the main computer. We implemented both setpoint trajectory tracking and velocity control to allow flexible operations.

VI. Experimental Results

Multiple experiments were conducted to demonstrate the robustness of our system. We begin with a quantitative evaluation in a lab environment equipped with a motion capture system. We then test our system in two real-world autonomous flight experiments, in an industrial complex and on a tree-lined campus.

A. Evaluation of Estimator Performance

We would like to push the limits of our onboard estimator. Therefore, we had a professional pilot aggressively fly the quadrotor with a 3.5 m/s maximum speed and large attitudes of up to 40°. The onboard state estimates are compared with the ground truth from the motion capture system. Since there is no GPS measurement indoors, our system relies on a fusion of relative measurements from laser and vision. We do observe occasional laser failures due to large attitudes violating the 2.5-D assumption (Sect. V-C). However, the multi-sensor filter still tracks the vehicle state throughout (FIG. 5A). We do not quantify the absolute pose error since it is unbounded. However, the body frame velocity (FIG. 5B) compares well with the ground truth, with standard deviations of {0.1021, 0.1185, 0.0755}^T (m/s) in x, y, and z, respectively.

B. Autonomous Flight in Large-Scale Indoor and Outdoor Environments

We tested our system in a challenging industrial complex. The testing site spans a variety of environments, including outdoor open space, densely filled trees, a cluttered building area, and indoor environments (FIGS. 6A-6H). The MAV is autonomously controlled using the onboard state estimates. However, a human operator always has the option of sending high level waypoints or velocity commands to the vehicle. The total flight time is approximately 8 minutes, and the vehicle travels 445 meters with an average speed of 1.5 m/s. As shown in the map-aligned trajectory (FIG. 7), frequent sensor failures occurred during the experiment (FIG. 8), indicating the necessity of multi-sensor fusion. FIG. 9 shows the evolution of the covariance as the vehicle flies through a GPS shadowing area. The global x, y, and yaw error is bounded by the GPS measurement, without which the error would grow unbounded. This matches the observability analysis results. It should be noted that the error on the body frame velocity does not grow, regardless of the availability of GPS. The spike in velocity covariance in FIG. 9 is due to the camera facing direct sunlight.

C. Autonomous Flight in Tree-Lined Campus

We also conducted experiments in a tree-lined campus environment, as shown in FIG. 10. Autonomous flight in this environment is challenging due to nontrivial lighting condition changes as the vehicle moves in and out of tree shadows. The risk of GPS failure is also very high due to the trees above the vehicle. Laser-based odometry only works when close to buildings. The total trajectory length is 281 meters.

VII. Conclusion and Future Work

In this disclosure, we present a modular and extensible approach to integrate noisy measurements from multiple heterogeneous sensors that yield either absolute or relative observations at different and varying time intervals. Our approach generates high rate state estimates in real time for autonomous flight. The proposed approach runs onboard our new 1.9 kg MAV platform equipped with multiple heterogeneous sensors. We demonstrate the robustness of our framework in large-scale, indoor and outdoor autonomous flight experiments that involve traversal through an industrial complex and a tree-lined campus.

In the near future, we would like to integrate higher level planning and situational awareness on our MAV platform to achieve fully autonomous operation across large-scale complex environments.

FIG. 12 is a block diagram illustrating an MAV for fusing measurements from sensors that produce both absolute and relative measurements according to an embodiment of the subject matter described herein. Referring to FIG. 12, the MAV 100 includes one or more motors 102 for controlling motion of the MAV using one or more rotors 104. As stated above, in the experiments described herein, the Pelican quadrotor available from Ascending Technologies was used. However, other rotorcraft can be substituted without departing from the scope of the subject matter described herein. The MAV also includes a controller 106 for controlling operation of the motors 102 based on sensor input. A computation unit 108 includes a sensor fusion module 110 that fuses the measurements from multiple sensors and produces an output signal to controller 106. In the illustrated example, sensor fusion module 110 receives input from IMU 112, pressure altimeter 114, magnetometer 116, laser scanner 118, GPS receiver 120, cameras 122, and pressure altimeter 123. Sensor fusion module 110 converts relative measurements, such as those produced by laser scanner 118 and cameras 122, to measurements that depend on augmented states as described above. The transformed measurements are combined using the Unscented Kalman Filter described above and output to controller 106. The signal provided as output to controller 106 serves as feedback to controller 106 for controlling position, velocity, and acceleration of MAV 100. Controller 106 also receives inputs from a trajectory estimator 124, which estimates the trajectory of MAV 100 needed to arrive at user-specified waypoints.

FIG. 13 is a flow chart illustrating an exemplary process for controlling motion of a rotorcraft MAV using multi-sensor fusion according to an embodiment of the subject matter described herein. Referring to FIG. 13, in step 200, input is received from sensors of multiple different modalities. For example, computation unit 108 and sensor fusion module 110 may receive input from any one or more of the sensors illustrated in FIG. 12 from which output is available at a given time. In step 202, relative output measurements produced by some of the sensors that depend on previous states are converted into measurements that depend on augmented states. The process of performing such conversions is described above in Section IV(A). In step 204, measurements from the different sensors are combined and filtered. For example, the measurements may be combined using an Unscented Kalman Filter. In step 206, the combined measurements are output to a trajectory generator along with a waypoint input by a user. In step 208, the output of the trajectory generator is used to control motion of the rotorcraft MAV.

The disclosure of each of the following references is incorporated herein by reference in its entirety.

- [1] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proc. of SPIE, I. Kadar, Ed., vol. 3068, July 1997, pp. 182-193.
- [2] R. V. D. Merwe, E. A. Wan, and S. I. Julier, "Sigma-point Kalman filters for nonlinear estimation: Applications to integrated navigation," in Proc. of AIAA Guidance, Navigation, and Controls Conf., Providence, R.I., August 2004.
- [3] A. Bachrach, S. Prentice, R. He, and N. Roy, "RANGE-robust autonomous navigation in GPS-denied environments," J. Field Robotics, vol. 28, no. 5, pp. 644-666, 2011.
- [4] S. Shen, N. Michael, and V. Kumar, "Autonomous multi-floor indoor navigation with a computationally constrained MAV," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Shanghai, China, May 2011, pp. 20-25.
- [5] S. Weiss, M. W. Achtelik, S. Lynen, M. Chli, and R. Siegwart, "Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Saint Paul, Minn., May 2012, pp. 957-964.
- [6] D. G. Kottas, J. A. Hesch, S. L. Bowman, and S. I. Roumeliotis, "On the consistency of vision-aided inertial navigation," in Proc. of the Intl. Sym. on Exp. Robot., Quebec, Canada, June 2012.
- [7] F. Fraundorfer, L. Heng, D. Honegger, G. H. Lee, L. Meier, P. Tanskanen, and M. Pollefeys, "Vision-based autonomous mapping and exploration using a quadrotor MAV," in Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., Vilamoura, Algarve, Portugal, October 2012.
- [8] K. Schmid, T. Tomic, F. Ruess, H. Hirschmuller, and M. Suppa, "Stereo vision based indoor/outdoor navigation for flying robots," in Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., Tokyo, Japan, November 2013.
- [9] A. S. Huang, A. Bachrach, P. Henry, M. Krainin, D. Maturana, D. Fox, and N. Roy, "Visual odometry and mapping for autonomous flight using an RGB-D camera," in Proc. of the Intl. Sym. of Robot. Research, Flagstaff, Ariz., August 2011.
- [10] T. Tomic, K. Schmid, P. Lutz, A. Domel, M. Kassecker, E. Mair, I. L. Grixa, F. Ruess, M. Suppa, and D. Burschka, "Autonomous UAV: Research platform for indoor and outdoor urban search and rescue," IEEE Robot. Autom. Mag., vol. 19, no. 3, pp. 46-56, 2012.
- [11] S. I. Roumeliotis and J. W. Burdick, "Stochastic cloning: A generalized framework for processing relative state measurements," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Washington, D.C., May 2002, pp. 1788-1795.
- [12] J. Carlson, "Mapping large urban environments with GPS-aided SLAM," Ph.D. dissertation, CMU, Pittsburgh, Pa., July 2010.
- [13] D. Schleicher, L. M. Bergasa, M. Ocana, R. Barea, and E. Lopez, "Real-time hierarchical GPS aided visual SLAM on urban environments," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Kobe, Japan, May 2009, pp. 4381-4386.
- [14] S. Shen, Y. Mulgaonkar, N. Michael, and V. Kumar, "Vision-based state estimation and trajectory control towards high-speed flight with a quadrotor," in Proc. of Robot.: Sci. and Syst., Berlin, Germany, 2013.
- [15] S. Lynen, M. W. Achtelik, S. Weiss, M. Chli, and R. Siegwart, "A robust and modular multi-sensor fusion approach applied to MAV navigation," in Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., Tokyo, Japan, November 2013.
- [16] T. Lefebvre, H. Bruyninckx, and J. De Schutter, "Comment on 'A new method for the nonlinear transformation of means and covariances in filters and estimators'," IEEE Trans. Autom. Control, vol. 47, no. 8, pp. 1406-1409, 2002.
- [17] D. C. Moore, A. S. Huang, M. Walter, and E. Olson, "Simultaneous local and global state estimation for robotic navigation," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Kobe, Japan, May 2009, pp. 3794-3799.
- [18] R. Kuemmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, "g2o: A general framework for graph optimization," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Shanghai, China, May 2011, pp. 3607-3613.
- [19] R. Smith, M. Self, and P. Cheeseman, "Estimating uncertain spatial relationships in robotics," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., vol. 4, Raleigh, N.C., March 1987, p. 850.
- [20] T. Lee, M. Leok, and N. H. McClamroch, "Geometric tracking control of a quadrotor UAV on SE(3)," in Proc. of the Intl. Conf. on Decision and Control, Atlanta, Ga., December 2010, pp. 5420-5425.

As stated above, an autonomous rotorcraft MAV according to an embodiment of the subject matter described herein may include a trajectory generator or estimator 124 for generating a trajectory plan for controlling a trajectory of a rotorcraft MAV during flight based on an estimated current state of the rotorcraft MAV and a waypoint input by a user. The following description illustrates trajectory planning that may be performed by trajectory generator or estimator 124 according to one embodiment of the subject matter described herein.

Vision-Based Autonomous Navigation in Complex Environments with a Quadrotor

The subject matter described herein includes a system design that enables a light-weight quadrotor equipped with only forward-facing cameras and an inexpensive IMU to autonomously navigate and efficiently map complex environments. We focus on robust integration of the high rate onboard vision-based state estimation and control, the low rate onboard visual SLAM, and online planning and trajectory generation approaches. Stable tracking of smooth trajectories is achieved under challenging conditions such as sudden waypoint changes and large scale loop closure. The performance of the proposed system is demonstrated via experiments in complex indoor and outdoor environments.

I. Introduction

Quadrotor micro-aerial vehicles (MAVs) are ideal platforms for surveillance and search and rescue in confined indoor and outdoor environments due to their small size and superior mobility. In such missions, it is essential that the quadrotor be autonomous to minimize operator workload. In this work, we are interested in pursuing a light-weight, off-the-shelf quadrotor to autonomously navigate complex unknown indoor and outdoor environments using only onboard sensors, with the critical control computations running in real time onboard the robot.

The problem of autonomous aerial navigation has been studied extensively over the past few years. Early works [1]-[3] primarily rely on laser scanners as the main sensor and localize the vehicle in indoor environments with structural elements that do not vary greatly along the vertical direction (the 2.5D assumption). Mechanized panning laser scanners that add considerable payload mass are used in [4, 5] for state estimation. Vision-based approaches, such as those in [6]-[8], rely on a downward-facing camera, a combination of stereo vision and a downward-facing optical flow sensor, and an RGB-D sensor, respectively, to achieve stable autonomous flight in indoor and/or outdoor environments. However, these approaches are unable to exploit the mobility and maneuverability of the quadrotor platform due to pragmatic concerns that arise from environment structure assumptions, reduced algorithm update rates, or the large vehicle size. Moreover, approaches that rely on downward-facing vision sensors [6, 7] often fail to perform robustly in environments with featureless floors or at low altitudes.

At the other end of the spectrum, there are many successful reactive navigation approaches that do not rely on metric state estimation [9, 10]. Although these approaches enable autonomous flight with low computation power, they fundamentally limit the flight capabilities of the MAV when operating in complex environments.

We pursue an autonomous navigation approach that enables the vehicle to estimate its state in an unknown and unstructured environment, map the environment, plan in the map, and autonomously control along trajectories developed from this plan. Online obstacle detection and replanning permit operation in static and dynamic environments with average flight speeds of more than 1 m/s. At such speeds, low-latency state estimation, online smooth trajectory generation, and responsive vehicle control become necessary due to the agility of the platform. A challenge that arises in pursuit of this goal is the need to ensure that the estimated pose remains smooth and consistent, even during loop closures resulting from simultaneous localization and mapping. Traditionally, loop closure corrections are fused directly into the high rate onboard state estimator. This causes discontinuities in the estimated state, which, especially during rapid maneuvers, can lead to catastrophic crashes of the quadrotor.

In this work, we address these requirements by proposing a system architecture that employs two forward-facing cameras as the primary sensors, and a novel methodology that maintains estimation smoothness and control stability during replanning and loop closure, which in turn enables efficient autonomous navigation in complex environments.

II. System Design and Methodology

We begin by providing an overview of the system architecture and methodology, and the hardware and software components required for our design. Detailed discussion of the major components is given in subsequent sections following the logical flow of the system block diagram (FIG. 15).

A. Hardware Platform

The experimental platform (FIG. 14) is based on the Hummingbird quadrotor from Ascending Technologies (see http://www.asctec.de). This off-the-shelf platform comes with an AutoPilot board that is equipped with an inexpensive IMU and a user-programmable ARM7 microcontroller. The high level computer onboard includes an Intel Atom 1.6 GHz processor and 1 GB RAM. Communication between the onboard computer and a ground station is via an 802.11n wireless network. The only new additions to the platform are two grayscale mvBlueFOX-MLC200w cameras with hardware HDR. All cameras are equipped with fisheye lenses. The synchronization between cameras and IMU is ensured via hardware triggering. The total weight of the platform is 740 g.

B. Software Architecture and Methodology

The software architecture is shown in FIG. 15. This architecture allows us to divide the computations between the onboard low and high level processors and the offboard ground station. On the onboard high level computer, a vision-based estimator provides 6-DOF pose estimates at 20 Hz. We employ an unscented Kalman filter (UKF) to fuse pose estimates with IMU measurements and generate 100 Hz state estimates that are directly used as the feedback for the nonlinear tracking controller. On the ground station, a stereo-based visual SLAM module generates a 3D voxel grid map for the high level planner. The SLAM module also provides global pose correction. However, we do not directly fuse this pose correction with the vision-based state estimate since it may cause significant pose discontinuities in the event of large scale loop closures. Instead, we transform the waypoints using the pose correction such that, if the vehicle follows these transformed waypoints, it is still able to reach the global goal. We further develop a trajectory generator that runs onboard the high level computer at 100 Hz to convert the desired waypoints into smooth polynomial trajectories.

III. Visual-Inertial (VINS) State Estimation

A. Vision-Based Pose Estimation

We use a modification of our earlier work [11] to estimate the 6-DOF pose of the vehicle. Note that although we equip the platform with two cameras, we do not perform traditional stereo-based state estimation. Instead, we set one camera that captures images at 20 Hz as the primary camera, while the other camera is configured to capture images at 1 Hz. Because we do not perform high rate disparity computations, the required computational power is reduced. However, the stereo geometry allows us to estimate metric information, preserving the scale of the local map and the pose estimates.

1) Monocular-Based Pose Estimation:

For images captured by the primary fisheye camera, we detect FAST corners [12] and track them using the KLT tracker [13]. Note that due to the high frame rate of the primary camera, we are able to perform feature tracking directly on the distorted fisheye images, avoiding the additional computational overhead of image undistortion. We utilize the incremental rotation estimate from short term integration of the gyroscope measurements and perform 2-point RANSAC to reject tracking outliers. We propose a decoupled orientation and position estimation scheme in order to make use of distant features that are not yet triangulated. The orientation of the robot R_(j) is estimated via epipolar constraints with look-back history to minimize drift. Assuming the existence of a perfect 3D local map, which consists of triangulated 3D features p_(i), i ∈ I, the position of the robot r_(j) can be found efficiently by solving the following linear system:

$\begin{matrix}{{\left( {\sum\limits_{i \in \mathcal{I}}\frac{I_{3} - {u_{ij}^{r}u_{ij}^{rT}}}{d_{i}}} \right)r_{j}} = {\sum\limits_{i \in \mathcal{I}}{\frac{I_{3} - {u_{ij}^{r}u_{ij}^{rT}}}{d_{i}}p_{i}}}} & (1)\end{matrix}$

where u_(ij) is the unit length feature observation vector; u_(ij)^(r)=R_(j)u_(ij); and d_(i)=∥r_(j-1)−p_(i)∥.
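
By way of non-limiting illustration, the linear solve in (1) may be realized in a few lines of Python. The function and argument names below are hypothetical, and a practical implementation would additionally guard against a near-singular normal matrix (e.g., when all tracked features are nearly collinear):

```python
import numpy as np

def solve_position(R_j, r_prev, features, observations):
    """Solve the linear system (1) for the robot position r_j.

    A minimal sketch (hypothetical helper, not the authors' code):
    features     -- triangulated 3D points p_i, each of shape (3,)
    observations -- unit feature observation vectors u_ij in the camera frame
    R_j          -- current orientation estimate (3x3 rotation matrix)
    r_prev       -- previous position r_{j-1}, used to approximate depth d_i
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p_i, u_ij in zip(features, observations):
        u_r = R_j @ u_ij                    # rotate bearing into the world frame
        d_i = np.linalg.norm(r_prev - p_i)  # approximate feature depth
        W = (np.eye(3) - np.outer(u_r, u_r)) / d_i
        A += W                              # left-hand side of (1)
        b += W @ p_i                        # right-hand side of (1)
    return np.linalg.solve(A, b)            # r_j
```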

Once the 6-DOF pose is found, the location of the feature p_(i) can be found by solving the following linear system:

A_(ij)p_(i)=b_(ij)  (2)

where A_(ij) and b_(ij) represent all observations of the i^(th) feature up to the j^(th) frame. This formulation is memoryless: the complexity of feature triangulation is constant regardless of the number of observations of that particular feature.
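
The constant-complexity accumulation behind (2) may be sketched as follows. The entries of A_(ij) and b_(ij) are assumed here to come from the standard ray-projection constraint (I − u u^T)(p_i − r_j) = 0, which is one common linear triangulation formulation; the text above does not spell out their exact form:

```python
import numpy as np

class FeatureTriangulator:
    """Constant-memory triangulation of one feature, in the spirit of (2).

    Each observation constrains the feature to lie on the ray from the
    camera position r_j along the world-frame bearing u_r. The 3x3 normal
    equations are accumulated in place, so the cost per new observation is
    constant regardless of track length.
    """
    def __init__(self):
        self.A = np.zeros((3, 3))   # accumulated A_ij
        self.b = np.zeros(3)        # accumulated b_ij

    def add_observation(self, r_j, u_r):
        W = np.eye(3) - np.outer(u_r, u_r)  # projector orthogonal to the ray
        self.A += W
        self.b += W @ r_j

    def triangulate(self):
        return np.linalg.solve(self.A, self.b)  # p_i from A_ij p_i = b_ij
```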

2) Stereo-Based Scale Recovery:

The pose estimation approach described above suffers from scale drift due to accumulated error in the monocular-based triangulation. Every instantaneous stereo measurement is used for scale drift compensation. Let K denote the set of features seen by both cameras. We can compute the ratio of the average scene depths as:

$\begin{matrix}{\gamma = {\frac{1}{\left| \mathcal{K} \right|}{\sum\limits_{k \in \mathcal{K}}\frac{\left\| {p_{k} - r_{j}} \right\|}{\left\| p_{k}^{s} \right\|}}}} & (3)\end{matrix}$

where p_(k)^(s) is the 3D feature location obtained solely via stereo triangulation. We can then compensate for the scale drift by modifying b_(ij) as in (4) and solving (2) again.

$\begin{matrix}{b_{ij}^{+} = {{\frac{1}{\gamma}b_{ij}} - {\frac{1}{\gamma}A_{ij}r_{j}} + {A_{ij}r_{j}}}} & (4)\end{matrix}$
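
A sketch of the scale compensation in (3) and (4) follows. It assumes p_(k)^(s) is expressed relative to the camera, so that its norm is the stereo depth; all names are illustrative:

```python
import numpy as np

def scale_ratio(mono_points, stereo_points, r_j):
    """Average depth ratio gamma of (3).

    mono_points   -- features p_k from monocular triangulation (world frame)
    stereo_points -- the same features p_k^s from instantaneous stereo,
                     assumed camera-relative so the norm is the stereo depth
    """
    ratios = [np.linalg.norm(p - r_j) / np.linalg.norm(ps)
              for p, ps in zip(mono_points, stereo_points)]
    return sum(ratios) / len(ratios)

def compensate(A_ij, b_ij, r_j, gamma):
    """Rescale the triangulation right-hand side as in (4)."""
    return b_ij / gamma - (A_ij @ r_j) / gamma + A_ij @ r_j
```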

B. UKF-Based Sensor Fusion

The 20 Hz pose estimate from the vision system alone is not sufficient to control the robot. A UKF with delayed measurement compensation is used to estimate the pose and velocity of the robot at 100 Hz [14]. The system state is defined as:

x=[r,{dot over (r)},q,a_(b)]^(T)  (5)

where r is the 3D position of the robot; q is the quaternion representation of the 3D orientation of the robot; and a_(b) is the bias of the accelerometer measurement in the body frame. We use a conventional IMU-based process model to propagate the system state, and a linear measurement model which consists of the 6-DOF pose for state correction.
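
The delayed measurement compensation mentioned above can be organized as a buffer-and-replay scheme around the filter. The sketch below stubs out the UKF itself (snapshot, restore, predict, and correct are hypothetical methods) and shows only the bookkeeping that keeps the 100 Hz output causal and smooth when a 20 Hz vision pose stamped in the past arrives:

```python
from collections import deque

class DelayedPoseFusion:
    """Sketch of fusing a delayed 20 Hz vision pose into a 100 Hz
    IMU-driven filter, in the spirit of [14]; all filter methods are stubs.

    When a pose measurement stamped in the past arrives, the filter is
    rolled back to the buffered state just before that stamp, corrected,
    and then re-propagated with the buffered IMU inputs up to the present.
    """
    def __init__(self, filt, horizon=100):
        self.filt = filt                     # UKF with predict()/correct()
        self.buffer = deque(maxlen=horizon)  # (stamp, imu, state) history

    def on_imu(self, stamp, imu):
        self.filt.predict(imu)               # 100 Hz propagation
        self.buffer.append((stamp, imu, self.filt.snapshot()))

    def on_vision_pose(self, stamp, pose):
        past = [(t, u, s) for (t, u, s) in self.buffer if t <= stamp]
        if not past:
            return                           # measurement too old; drop it
        t0, _, s0 = past[-1]
        self.filt.restore(s0)                # roll back to measurement time
        self.filt.correct(pose)              # 6-DOF pose update
        for t, u, _ in self.buffer:
            if t > t0:
                self.filt.predict(u)         # replay IMU to the present
```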

C. Performance of the Visual-Inertial State Estimator

FIG. 16 shows the comparison of the performance of the VINS estimator against ground truth from a Vicon motion capture system during autonomous tracking of a predefined trajectory (Sect. VIII-A). The onboard velocity estimates compare well with the Vicon estimates (all transformed to the body frame), with standard deviations of {σ_(vx), σ_(vy), σ_(vz)}={0.0500, 0.0706, 0.0309} (m/s). However, the lack of global bundle adjustment in the VINS estimator results in long term drift in the estimated pose due to its recursive formulation. We therefore introduce an odometry frame (r_(j)^(O), R_(j)^(O)) to represent this drifting behavior.

IV. Visual SLAM

We implement a visual SLAM module to eliminate the drift in the VINS system. Visual SLAM is a widely studied area. In small workspaces, approaches that use recursive filtering [15] or parallel tracking and mapping techniques [16] yield accurate results. Large scale mapping with monocular [17] or stereo [18] cameras is achieved using pose graph-based formulations. In our system, due to the limited onboard computation resources, limited wireless transmission bandwidth, and the accuracy of the onboard estimator, a high rate visual SLAM is both unnecessary and infeasible. Therefore, our visual SLAM module runs offboard at a maximum rate of 1 Hz. A pose graph-based SLAM back-end, together with a front-end that utilizes SURF features [19] for wide baseline loop closure detection, yields robust performance at such low rates. We sparsely sample the estimated robot trajectory to generate nodes for the pose graph. For each node, we compute sparse 3D points by detecting and matching SURF features between the stereo images. Dense disparity images and dense point clouds are also computed.

We detect loop closures by checking nodes that fall inside the uncertainty ellipsoid of the current node. We check a constant number of nodes, starting from the earliest candidate, for possible loop closures. SURF features are used to test the similarity between two scenes. We compute the relative transform between the current node and the loop closure candidate using RANSAC PnP [20]. A rigidity test, proposed in (Sect. 3.4, [21]), is performed to verify the geometric consistency of the loop closure transform. Candidate transforms that pass the geometric verification are added to the pose graph. Finally, we use the iSAM library for pose graph optimization [22]. Once an optimized pose graph is found, we can construct a 3D voxel grid map by projecting the dense point clouds into the global frame. This map is used for high level planning (Sect. V) and to enable the human operator to monitor the progress of the experiment. The optimized pose represents an estimate in the world frame and is denoted by (r_(j)^(W), R_(j)^(W)).
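
One way to realize the ellipsoid gate is shown in the non-limiting sketch below; the node fields and the 3-sigma threshold are assumptions, not values given in the text:

```python
import numpy as np

def loop_closure_candidates(nodes, current, max_checks=5):
    """Select pose-graph nodes inside the current node's uncertainty
    ellipsoid, earliest first; a sketch with hypothetical node fields.

    Each candidate node carries a position `p`, and the current node a
    covariance `cov`; the gate is a Mahalanobis-distance test, a common
    way to realize the ellipsoid check described above.
    """
    inv_cov = np.linalg.inv(current.cov)
    gated = []
    for node in nodes:                  # nodes assumed ordered oldest first
        d = node.p - current.p
        if d @ inv_cov @ d < 9.0:       # ~3-sigma ellipsoid gate (assumed)
            gated.append(node)
    return gated[:max_checks]           # constant number, earliest first
```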

The pose correction from the visual SLAM, d_(j)^(WO), which serves as the transform between the odometry frame and the world frame, is formulated such that:

(r_(j)^(W), R_(j)^(W))=d_(j)^(WO)⊕(r_(j)^(O), R_(j)^(O))  (6)

where ⊕ is the pose update function defined in [23]. In contrast to traditional approaches, we do not use (r_(j)^(W), R_(j)^(W)) as a global pose measurement for correcting the drift in the VINS system. Instead, we feed d_(j)^(WO) into the trajectory generator (Sect. VI) and compute trajectories that are guaranteed to be smooth even if there are large discontinuities in the visual SLAM pose estimate (i.e. ∥d_(j)^(WO)⊖d_(j-1)^(WO)∥ is large) due to loop closures. This is the major departure of our system from existing approaches, and it is the key to enabling high-speed autonomous navigation in complex environments. Further details are provided in Sect. VI.

V. High Level Planning

We employ a two-stage planning approach. On the higher level, given user-specified waypoints in the world frame, and treating the quadrotor as a cylinder, a high level path that connects the current robot position and the desired goal, consisting of a sequence of desired 3D positions and yaw angles, is generated using RRT* [24] as implemented in the Open Motion Planning Library (OMPL) [25]. The resulting path is simplified to a minimum number of K waypoints g_(k)^(W) and is sent to the trajectory generator (Sect. VI) for further refinement. The path is checked for possible collisions at the same frequency as the map update (1 Hz, Sect. IV). Although the high level planner only requires moderate computational resources, we run it offboard since all information required for high level planning comes from the offboard visual SLAM module. We also allow the user to bypass the planner and explicitly set a sequence of waypoints.
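
A minimal RRT* query through OMPL's Python bindings, in the spirit of the planner described above, is sketched below. The workspace bounds, time budget, and the stubbed collision checker (which in the full system would test the robot cylinder against the voxel grid map) are assumptions:

```python
from ompl import base as ob
from ompl import geometric as og

def plan(start_xyz, goal_xyz, is_free, time_budget=1.0):
    space = ob.RealVectorStateSpace(3)        # 3D position; yaw handled separately
    bounds = ob.RealVectorBounds(3)
    bounds.setLow(-50.0)                      # assumed workspace extent
    bounds.setHigh(50.0)
    space.setBounds(bounds)

    ss = og.SimpleSetup(space)
    ss.setStateValidityChecker(ob.StateValidityCheckerFn(
        lambda s: is_free(s[0], s[1], s[2])))  # cylinder-vs-voxel-map stub

    start, goal = ob.State(space), ob.State(space)
    for i in range(3):
        start[i], goal[i] = start_xyz[i], goal_xyz[i]
    ss.setStartAndGoalStates(start, goal)
    ss.setPlanner(og.RRTstar(ss.getSpaceInformation()))

    if ss.solve(time_budget):
        ss.simplifySolution()                 # reduce to few waypoints g_k
        return ss.getSolutionPath()
    return None
```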

VI. Trajectory Generation

We first transform all waypoints from the high level planner into the odometry frame using the latest pose correction from the visual SLAM (6):

g_(k)^(O)=⊖d_(j)^(WO)⊕g_(k)^(W)  (7)

If the robot flies through all transformed waypoints using the state estimate in the odometry frame for feedback control, it will also fly through the same set of waypoints in the world frame. Moreover, if there are large scale loop closures (i.e. large changes in d_(j)^(WO)), the set of waypoints that the robot is heading towards will change significantly. However, if we are able to regenerate smooth trajectories with initial conditions equal to the current state of the robot, the transition between trajectories will be smooth and no special handling is needed within the onboard state estimator and the controller.
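
For concreteness, the waypoint re-mapping in (7) may be sketched with 4x4 homogeneous transforms standing in for the pose update function of [23]; the function names are illustrative:

```python
import numpy as np

def compose(T_a, T_b):        # the pose update operation (6)
    return T_a @ T_b

def inverse(T):               # the inverse pose update in (7)
    Ti = np.eye(4)
    Ti[:3, :3] = T[:3, :3].T
    Ti[:3, 3] = -T[:3, :3].T @ T[:3, 3]
    return Ti

def to_odometry_frame(d_WO, waypoints_W):
    """Map world-frame waypoints g_k^W into the odometry frame, so that a
    controller fed with odometry-frame state still reaches the global goal.
    d_WO is the current SLAM pose correction as a 4x4 transform."""
    T_OW = inverse(d_WO)
    out = []
    for g in waypoints_W:
        gh = np.append(g, 1.0)          # homogeneous coordinates
        out.append((T_OW @ gh)[:3])
    return out
```

When a loop closure makes d_(j)^(WO) jump, re-running this mapping and regenerating the trajectory from the robot's current state yields the smooth transition described above.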

We wish to ensure that the quadrotor smoothly passes through all waypoints, while at the same time maintaining a reliable state estimate. A crucial condition that determines the quality of the vision-based estimate is the tracking performance. With our fisheye camera setup, it can be seen from FIGS. 17A-17D that fast translation has little effect on the tracking performance due to the large field of view. However, fast rotation can easily blur the image, causing failure of the KLT tracker. This observation motivates us to design trajectories that minimize the angular velocities in roll and pitch.

By differentiating the equation of motion of a quadrotor [26], it can be seen that the angular velocity of the body frame is affinely related to the jerk, the derivative of the linear acceleration. As such, we generate trajectories that minimize the jerk of the quadrotor in the horizontal directions.

For the vertical direction, we wish to minimize the RPM changes of the motors, which again correspond to the jerk. Intermediate waypoints are added shortly before and after a waypoint if the angle between the two line segments that connect this waypoint exceeds a threshold, in order to avoid large deviations from the high level path. We utilize a polynomial trajectory generation algorithm [27] that runs onboard the robot with a runtime on the order of 10 ms. Optimal trajectories can be found by solving the following unconstrained quadratic program:

$\begin{matrix}{\min\limits_{y}{y^{T}{Qy}}} & (8)\end{matrix}$

where y is a collection of desired derivative values at each waypoint, which can be either free or fixed. We fix the position, velocity, and acceleration at the first waypoint to the current state of the robot in order to maintain smooth trajectories during replanning and loop closures. The velocity and acceleration are set to zero for the last waypoint. For all other waypoints, only the position is fixed and the trajectory generator provides the velocity and acceleration profile. The coefficients of the polynomial trajectories s can be found via a linear mapping s=My.
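
The single-segment core of this construction can be made concrete: with position, velocity, and acceleration fixed at both ends, the jerk-minimizing polynomial is the unique quintic through those boundary conditions, so the QP (8) collapses to a linear solve. The sketch below (names illustrative) recovers the coefficients from the boundary derivatives, playing the role of the mapping s=My; the multi-segment case with free intermediate derivatives follows [27]:

```python
import numpy as np

def quintic_segment(y0, y1, T):
    """Coefficients c0..c5 of the minimum-jerk quintic p(t) = sum c_n t^n.

    y0 = (p, v, a) at the segment start (the robot's current state during
    replanning), y1 = (p, v, a) at the end, T = segment time.
    """
    p0, v0, a0 = y0
    p1, v1, a1 = y1
    # Rows evaluate p(0), v(0), a(0), p(T), v(T), a(T) on [c0..c5].
    M = np.array([
        [1, 0, 0,    0,      0,       0],
        [0, 1, 0,    0,      0,       0],
        [0, 0, 2,    0,      0,       0],
        [1, T, T**2, T**3,   T**4,    T**5],
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],
        [0, 0, 2,    6*T,    12*T**2, 20*T**3],
    ], dtype=float)
    y = np.array([p0, v0, a0, p1, v1, a1], dtype=float)
    return np.linalg.solve(M, y)   # linear mapping from derivatives to coefficients
```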

A limitation of the above trajectory generation approach is the necessity of predefining the travel time between waypoints. Due to computational constraints, we do not perform iterative time optimization [27, 28] to find the optimal segment times. We instead use a heuristic that approximates each segment with a velocity profile that accelerates from zero at a constant acceleration at the beginning of the segment, maintains constant velocity in the middle, and decelerates to zero at the end. This simple heuristic helps avoid excessive accelerations during short segments, and is a reasonable time approximation for long segments.
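
A sketch of this time heuristic follows; the velocity and acceleration limits are assumed values, not constants given in the text:

```python
def segment_time(dist, v_max=1.5, a_max=2.0):
    """Trapezoidal-profile travel time for one segment of length dist.

    Accelerate from rest at a_max, cruise at v_max, decelerate to rest.
    Short segments never reach v_max and use a triangular profile.
    """
    d_ramp = v_max ** 2 / a_max          # distance spent ramping up and down
    if dist < d_ramp:                    # triangular profile
        return 2.0 * (dist / a_max) ** 0.5
    return 2.0 * v_max / a_max + (dist - d_ramp) / v_max
```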

FIGS. 18A and 18B show, in simulation, a quadrotor tracking a smooth trajectory generated from a sequence of waypoints. A change of waypoints and trajectory regeneration take place at 20 s. The regenerated trajectory smoothly connects to the initial trajectory and the quadrotor is able to smoothly switch waypoints.

VII. Control

A. Position Tracking Controller

For this work, we choose a position tracking controller with a nonlinear error metric [29] due to its superior performance in highly dynamical motions that involve large angle changes and significant accelerations. The 100 Hz state estimate from the VINS system (Sect. III) is used directly as the feedback for the controller. In our implementation, the attitude controller runs at 1 kHz on the ARM processor of the robot's AutoPilot board, while the position tracking controller operates at 100 Hz on the Atom processor.

B. Hybrid-System Controller

Although our goal is to develop a fully autonomous vehicle, at some point during an experiment the human operator may wish to have simple but direct control of the vehicle. As such, we developed a finite state machine-based hybrid-system controller (FIG. 19) to allow human-robot interaction. There are four modes in this controller; the controller presented in Sect. VII-A operates in the position mode. At any time, the operator is able to send inputs via a remote control. These commands are interpreted by the vehicle as kinematic velocity commands (where no command results in a hover state). We found experimentally that the velocity control mode is easy to use, in the sense that an untrained operator is able to control the vehicle without direct line-of-sight using only the 1 Hz images and the global 3D map. The hover mode serves as an idle state, in which the vehicle waits for commands from the operator.
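
A non-limiting sketch of such a state machine is given below, covering the three modes named in the text (FIG. 19 shows four modes; the fourth is not described above and is omitted here). The input flags are hypothetical:

```python
from enum import Enum, auto

class Mode(Enum):
    """Modes of the hybrid-system controller named in the text."""
    HOVER = auto()      # idle: hold position, wait for operator commands
    POSITION = auto()   # track trajectories via the Sect. VII-A controller
    VELOCITY = auto()   # kinematic velocity commands from the operator

def step(mode, rc_input, trajectory_active):
    """One transition of the FSM (hypothetical input flags).

    Operator stick input always preempts autonomy; releasing the sticks
    (no command) falls back to hover, matching the behavior described."""
    if rc_input is not None:
        return Mode.VELOCITY
    if trajectory_active:
        return Mode.POSITION
    return Mode.HOVER
```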

VIII. Experimental Results

We present three representative experiments to demonstrate the performance of the proposed system. The first experiment demonstrates the ability of the proposed system to maintain globally consistent tracking; we provide a comparison with ground truth to quantify the performance. In the second experiment, the robot navigates an indoor environment with a large loop (approximately 190 m) and completes the loop within one battery charge (less than 5 minutes of flight time). Finally, we present an outdoor navigation experiment that emphasizes the robustness of the proposed system against environment changes and strong wind disturbances.

A. Evaluation of System Performance with Ground Truth Comparison

In this experiment, the robot autonomously follows a smooth trajectory generated from a rectangular pattern at approximately 1 m/s. Ground truth from Vicon is used to quantify the global tracking performance. As seen in FIG. 20A and FIG. 20B, there is slow position drift in the VINS state estimate. However, global corrections from the offboard visual SLAM result in globally consistent operation. Note that the robot is controlled using the VINS state estimate, even as global loop closures are clearly being merged into the system. Due to the corrections from the visual SLAM, the desired smooth trajectory in the odometry frame is regenerated and changes over time. It can be seen from FIG. 20A that the actual position of the robot converges to the desired position, with standard deviations of {σ_(x), σ_(y), σ_(z)}={0.1108, 0.1186, 0.0834} (m), indicating globally consistent tracking.

B. Navigation of Indoor Environments with Large Loops

We now consider a case where the robot autonomously navigates through a large-scale environment with loops. Due to the size of the loop (approximately 190 m) and the short battery life (less than 5 min), we must achieve high-speed navigation in order to complete the task. This environment poses significant challenges to approaches that use downward facing cameras [6, 7] due to the featureless floor (FIGS. 21A and 21B). However, a reliable state estimate is obtained by the proposed system, and the robot successfully completes the experiment with a maximum speed of over 1.5 m/s and an average speed of 1 m/s. A large-scale loop closure is detected at 257 s (FIG. 22C), during which both the SLAM pose and the 3D map change significantly (FIGS. 22A-22B). However, as seen in FIG. 22C, the state estimate that is used for feedback control of the robot remains smooth throughout the experiment, and the robot is able to return to the global origin by following the transformed waypoints in the odometry frame (Sect. VI).

C. Autonomous Navigation in Complex Outdoor Environments

This experiment demonstrates the performance of the proposed system in outdoor environments. The experiment is conducted on a typical winter day in Philadelphia, Pa., where the wind speed goes up to 20 km/hr. The total travel distance is approximately 170 m, with a total duration of 166 s (FIG. 23). Snapshots from the video camera and images captured by the onboard camera are shown in FIGS. 24A-24D. Note that the outdoor environment is largely unstructured, consisting of trees and vegetation, demonstrating the ability of the system to operate in unstructured environments as well.

IX. Conclusion and Future Work

As described herein, we propose a system design that enables globally consistent autonomous navigation in complex environments with a lightweight, off-the-shelf quadrotor using only onboard cameras and an IMU as sensors. We address the issue of maintaining smooth trajectory tracking during challenging conditions such as sudden waypoint changes and loop closures. Online experimental results in both indoor and outdoor environments are presented to demonstrate the performance of the proposed system.

An integrated laser- and/or GPS-based state estimation approach may be incorporated into our current system to extend the operational environments and enhance system robustness.

The disclosure of each of the following references is incorporated herein by reference in its entirety.

REFERENCES

-   [1]. A. Bachrach, S. Prentice, R. He, and N. Roy, "RANGE-robust autonomous navigation in GPS-denied environments," J. Field Robotics, vol. 28, no. 5, pp. 644-666, 2011.
-   [2]. S. Grzonka, G. Grisetti, and W. Burgard, "A fully autonomous indoor quadrotor," IEEE Trans. Robot., vol. PP, no. 99, pp. 1-11, 2011.
-   [3]. S. Shen, N. Michael, and V. Kumar, "Autonomous multi-floor indoor navigation with a computationally constrained MAV," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Shanghai, China, May 2011, pp. 20-25.
-   [4]. S. Scherer, J. Rehder, S. Achar, H. Cover, A. Chambers, S. Nuske, and S. Singh, "River mapping from a flying robot: state estimation, river detection, and obstacle mapping," Auton. Robots, vol. 33, no. 1-2, pp. 189-214, August 2012.
-   [5]. A. Kushleyev, B. MacAllister, and M. Likhachev, "Planning for landing site selection in the aerial supply delivery," in Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., San Francisco, Calif., September 2011, pp. 1146-1153.
-   [6]. S. Weiss, M. W. Achtelik, S. Lynen, M. Chli, and R. Siegwart, "Real-time onboard visual-inertial state estimation and self-calibration of MAVs in unknown environments," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Saint Paul, Minn., May 2012, pp. 957-964.
-   [7]. F. Fraundorfer, L. Heng, D. Honegger, G. H. Lee, L. Meier, P. Tanskanen, and M. Pollefeys, "Vision-based autonomous mapping and exploration using a quadrotor MAV," in Proc. of the IEEE/RSJ Intl. Conf. on Intell. Robots and Syst., Vilamoura, Algarve, Portugal, October 2012.
-   [8]. A. S. Huang, A. Bachrach, P. Henry, M. Krainin, D. Maturana, D. Fox, and N. Roy, "Visual odometry and mapping for autonomous flight using an RGB-D camera," in Proc. of the Intl. Sym. of Robot. Research, Flagstaff, Ariz., August 2011.
-   [9]. C. Bills, J. Chen, and A. Saxena, "Autonomous MAV flight in indoor environments using single image perspective cues," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Shanghai, China, May 2011, pp. 5776-5783.
-   [10]. G. de Croon, C. De Wagter, B. Remes, and R. Ruijsink, "Sub-sampling: Real-time vision for micro air vehicles," Robot. and Autom. Syst., vol. 60, no. 2, pp. 167-181, February 2012.
-   [11]. S. Shen, Y. Mulgaonkar, N. Michael, and V. Kumar, "Vision-based state estimation for autonomous rotorcraft MAVs in complex environments," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Karlsruhe, Germany, May 2013, to appear.
-   [12]. E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Proc. of the European Conf. on Computer Vision, Graz, Austria, May 2006.
-   [13]. B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. of the Intl. Joint Conf. on Artificial Intelligence, Vancouver, Canada, August 1981, pp. 24-28.
-   [14]. R. V. D. Merwe, E. A. Wan, and S. I. Julier, "Sigma-point Kalman filters for nonlinear estimation: Applications to integrated navigation," in Proc. of AIAA Guidance, Navigation, and Controls Conf., Providence, R.I., August 2004.
-   [15]. J. Civera, A. J. Davison, and J. Montiel, "Inverse depth parameterization for monocular SLAM," IEEE Trans. Robot., vol. 24, no. 5, pp. 932-945, October 2008.
-   [16]. G. Klein and D. Murray, "Parallel tracking and mapping for small AR workspaces," in Proc. Sixth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR'07), Nara, Japan, November 2007.
-   [17]. H. Strasdat, J. M. M. Montiel, and A. J. Davison, "Scale drift-aware large scale monocular SLAM," in Proc. of Robot.: Sci. and Syst., Zaragoza, Spain, June 2010.
-   [18]. C. Mei, G. Sibley, M. Cummins, P. Newman, and I. Reid, "RSLAM: A system for large-scale mapping in constant-time using stereo," Intl. J. of Computer Vision, pp. 1-17, June 2010.
-   [19]. H. Bay, T. Tuytelaars, and L. V. Gool, "SURF: Speeded up robust features," in Proc. of the European Conf. on Computer Vision, Graz, Austria, May 2006.
-   [20]. F. Moreno-Noguer, V. Lepetit, and P. Fua, "Accurate non-iterative O(n) solution to the PnP problem," in Proc. of the IEEE Intl. Conf. on Computer Vision, Rio de Janeiro, Brazil, October 2007.
-   [21]. E. B. Olson, "Robust and efficient robotic mapping," Ph.D. dissertation, MIT, Cambridge, Mass., June 2008.
-   [22]. M. Kaess, A. Ranganathan, and F. Dellaert, "iSAM: Incremental smoothing and mapping," IEEE Trans. Robot., vol. 24, no. 6, pp. 1365-1378, December 2008.
-   [23]. R. Smith, M. Self, and P. Cheeseman, "Estimating uncertain spatial relationships in robotics," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Raleigh, N.C., March 1987, p. 850.
-   [24]. S. Karaman and E. Frazzoli, "Incremental sampling-based algorithms for optimal motion planning," in Proc. of Robot.: Sci. and Syst., Zaragoza, Spain, June 2010.
-   [25]. I. A. Sucan, M. Moll, and L. E. Kavraki, "The Open Motion Planning Library," IEEE Robot. Autom. Mag., vol. 19, no. 4, pp. 72-82, December 2012, http://ompl.kavrakilab.org.
-   [26]. N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, "The GRASP multiple micro UAV testbed," IEEE Robot. Autom. Mag., vol. 17, no. 3, pp. 56-65, September 2010.
-   [27]. C. Richter, A. Bry, and N. Roy, "Polynomial trajectory planning for quadrotor flight," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Karlsruhe, Germany, May 2013, to appear.
-   [28]. D. Mellinger and V. Kumar, "Minimum snap trajectory generation and control for quadrotors," in Proc. of the IEEE Intl. Conf. on Robot. and Autom., Shanghai, China, May 2011, pp. 2520-2525.
-   [29]. T. Lee, M. Leok, and N. McClamroch, "Geometric tracking control of a quadrotor UAV on SE(3)," in Proc. of the Intl. Conf. on Decision and Control, Atlanta, Ga., December 2010, pp. 5420-5425.

The subject matter described herein includes any combination of the elements or techniques described herein, even if not expressly described as a combination. For example, elements or methodologies described in the section entitled Vision Based Autonomous Navigation in Complex Environments with a Quadrotor can be combined with any of the methods or elements described prior to that section.

It will be understood that various details of the subject matter described herein may be changed without departing from the scope of the subject matter described herein. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation.

1. A system that enables autonomous control of a vehicle in indoor and outdoor environments, the system comprising: a sensor fusion module for combining measurements from a plurality of sensors of different modalities to estimate a current state of the vehicle given current and previous measurements from the sensors and a previous estimated state of the vehicle, wherein the sensor fusion module is configured to maintain smoothness in the state estimates of the vehicle when one or more sensors provide inaccurate information, when global positioning system (GPS) measurements become unavailable after a period of availability, or when GPS measurements become available after a period of unavailability; and a trajectory generator for generating a plan for controlling a trajectory of the vehicle based on the estimated current state and a goal or a waypoint input provided by either a user or a higher level planner.
2. The system of claim 1 wherein the sensors include an inertial measurement unit (IMU), and at least one of a pressure altimeter, a magnetometer, a laser scanner, a camera, a downward facing optical sensor, and a global positioning system (GPS) receiver.
3. The system of claim 1 wherein the sensor fusion module is configured to use an Unscented Kalman Filter (UKF) to combine the measurements from the sensors of different modalities, enabling addition and removal of sensors without reconfiguration of software of the sensor fusion module.
4. The system of claim 3 wherein the sensor fusion module is configured to estimate the current state using current relative measurements and copies of augmented past states in the filter.
5. The system of claim 3 wherein the sensor fusion module is configured to judiciously remove augmented states from the filter and add new augmented states to the filter.
6. The system of claim 3 wherein the sensor fusion module is configured to fuse measurements from the sensors that arrive out of order at the filter.
7. A method that enables autonomous control of a vehicle in indoor and outdoor environments, the method comprising: combining measurements from a plurality of sensors of different modalities to generate an estimate of a current state of the vehicle given current measurements from the sensors and a previous estimated state of the vehicle; generating a signal for planning a trajectory of the vehicle based on the estimated current state and a goal or waypoint input by a user or a higher level planner; and smoothing changes in state of the vehicle when output from one or more of the sensors is inaccurate, when global positioning system (GPS) measurements become available after a period of unavailability, or when GPS measurements become unavailable after a period of availability.
8. The method of claim 7 wherein the sensors include at least an inertial measurement unit (IMU) and at least one of a pressure altimeter, a magnetometer, a laser scanner, a camera, and a GPS receiver.
9. The method of claim 7 wherein combining the measurements includes using an Unscented Kalman Filter (UKF) to combine the measurements from the sensors of different modalities, enabling addition and removal of sensors without reconfiguration of the sensor fusion module.
10. The method of claim 9 wherein estimating the current state includes using current relative measurements and copies of augmented past states in the filter.
11. The method of claim 9 comprising removing augmented states from the filter in response to addition of a new augmented state with a binary selection matrix corresponding to that of a previous augmented state.
12. The method of claim 9 comprising fusing measurements from the sensors that arrive out of order at the filter.
13. A non-transitory computer readable medium having stored thereon executable instructions that when executed by the processor of a computer control the computer to perform steps comprising: combining measurements from a plurality of sensors of different modalities to generate an estimate of a current state of a vehicle given current measurements from the sensors and a previous estimated state of the vehicle; generating a signal for planning a trajectory of the vehicle based on the estimated current state and a goal or waypoint input by a user or a higher level planner; and smoothing changes in state of the vehicle when output from one or more of the sensors is inaccurate, when global positioning system (GPS) measurements become available after a period of unavailability, or when GPS measurements become unavailable after a period of availability.
14. The system of claim 1 wherein the vehicle comprises a rotorcraft micro-aerial vehicle (MAV).
15. The method of claim 7 wherein the vehicle comprises a rotorcraft micro-aerial vehicle (MAV).
16. The non-transitory computer readable medium of claim 13 wherein the vehicle comprises a rotorcraft micro-aerial vehicle (MAV).